perf[fsst]: like pushdown using a dfa #6935
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Merging this PR will improve performance by ×4.3
ea9d592 to f86e309
Polar Signals Profiling Results
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.029x ➖
datafusion / vortex-file-compressed (1.029x ➖, 0↑ 1↓)
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.058x ➖, 0↑ 2↓)
datafusion / vortex-compact (1.061x ➖, 0↑ 1↓)
datafusion / parquet (1.048x ➖, 0↑ 1↓)
datafusion / arrow (1.080x ➖, 1↑ 8↓)
duckdb / vortex-file-compressed (1.071x ➖, 0↑ 8↓)
duckdb / vortex-compact (1.051x ➖, 0↑ 3↓)
duckdb / parquet (1.024x ➖, 2↑ 3↓)
duckdb / duckdb (1.044x ➖, 0↑ 1↓)
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.145x ❌, 0↑ 5↓)
datafusion / vortex-compact (1.010x ➖, 0↑ 0↓)
datafusion / parquet (0.991x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.111x ❌, 0↑ 4↓)
duckdb / vortex-compact (0.991x ➖, 0↑ 0↓)
duckdb / parquet (0.992x ➖, 0↑ 0↓)
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.953x ➖, 5↑ 0↓)
datafusion / vortex-compact (0.956x ➖, 2↑ 0↓)
datafusion / parquet (0.964x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.952x ➖, 10↑ 0↓)
duckdb / vortex-compact (0.972x ➖, 1↑ 0↓)
duckdb / parquet (0.973x ➖, 5↑ 1↓)
duckdb / duckdb (0.983x ➖, 4↑ 6↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.009x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.007x ➖, 0↑ 0↓)
datafusion / parquet (1.018x ➖, 0↑ 1↓)
datafusion / arrow (1.018x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.006x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.007x ➖, 0↑ 0↓)
duckdb / parquet (1.012x ➖, 0↑ 1↓)
duckdb / duckdb (0.999x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.840x ➖, 2↑ 0↓)
datafusion / vortex-compact (0.956x ➖, 1↑ 1↓)
datafusion / parquet (0.882x ➖, 4↑ 0↓)
duckdb / vortex-file-compressed (1.007x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.050x ➖, 0↑ 0↓)
duckdb / parquet (0.959x ➖, 0↑ 0↓)
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.035x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.880x ➖, 2↑ 0↓)
datafusion / parquet (0.915x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.989x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.963x ➖, 0↑ 0↓)
duckdb / parquet (1.002x ➖, 0↑ 0↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.001x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.021x ➖, 0↑ 0↓)
duckdb / parquet (0.992x ➖, 0↑ 0↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.020x ➖, 0↑ 3↓)
datafusion / parquet (1.021x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.978x ➖, 5↑ 0↓)
duckdb / parquet (1.006x ➖, 0↑ 0↓)
duckdb / duckdb (0.980x ➖, 2↑ 0↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.990x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.916x ➖, 1↑ 0↓)
datafusion / parquet (0.925x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.989x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.050x ➖, 0↑ 0↓)
duckdb / parquet (1.004x ➖, 0↑ 1↓)
Benchmarks: Random Access
Vortex (geomean): 0.963x ➖
unknown / unknown (1.044x ➖, 6↑ 17↓)
Benchmarks: Compression
Vortex (geomean): 1.010x ➖
unknown / unknown (1.011x ➖, 2↑ 3↓)
0ax1 left a comment
Initial first pass. Doing a second, in-depth round.
FSST `like` execution without decompression. This uses a DFA over the symbol table and the like expression.
Once this is proved out we could think about putting this in fsst-rs?
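
To make the idea concrete, here is a minimal sketch of running a DFA directly over FSST codes. It is not the code in this PR and not the fsst-rs API: `SymbolTable`, `ByteDfa` and `SymbolDfa` are hypothetical names, only the `%needle%` (contains) form of `like` is covered, and the compressed layout is assumed to be one byte per code with code 255 as an escape followed by a literal byte. The byte-level DFA is a standard KMP automaton; the symbol-level table is precomputed by running each symbol's bytes through it from every state.

```rust
/// Escape code (assumed layout): the next compressed byte is a raw literal.
pub const ESCAPE: u8 = 255;

/// Hypothetical stand-in for an FSST symbol table: code `i` expands to `symbols[i]`.
pub struct SymbolTable {
    pub symbols: Vec<Vec<u8>>,
}

/// Byte-level DFA for `LIKE '%needle%'`, built as a KMP automaton.
/// State `s` means the last `s` bytes seen equal the first `s` needle bytes.
/// States are `u16`, so this sketch assumes needles shorter than 65536 bytes.
pub struct ByteDfa {
    next: Vec<[u16; 256]>, // next[state][byte]
    accept: u16,
}

impl ByteDfa {
    pub fn contains(needle: &[u8]) -> Self {
        let m = needle.len();
        let mut next = vec![[0u16; 256]; m + 1];
        next[m] = [m as u16; 256]; // the accepting state is absorbing
        if m > 0 {
            next[0][needle[0] as usize] = 1;
            let mut fallback = 0usize;
            for s in 1..m {
                // On a mismatch, behave like the state reached by the longest border.
                next[s] = next[fallback];
                next[s][needle[s] as usize] = (s + 1) as u16;
                fallback = next[fallback][needle[s] as usize] as usize;
            }
        }
        ByteDfa { next, accept: m as u16 }
    }

    /// Feed `bytes` through the DFA starting from `state`.
    fn run(&self, mut state: u16, bytes: &[u8]) -> u16 {
        for &b in bytes {
            state = self.next[state as usize][b as usize];
        }
        state
    }
}

/// The byte DFA lifted to symbol codes: one table lookup per compressed code.
pub struct SymbolDfa {
    byte_dfa: ByteDfa,
    sym_next: Vec<Vec<u16>>, // sym_next[state][code]
}

impl SymbolDfa {
    pub fn new(byte_dfa: ByteDfa, table: &SymbolTable) -> Self {
        let sym_next: Vec<Vec<u16>> = (0..byte_dfa.next.len())
            .map(|s| {
                table
                    .symbols
                    .iter()
                    .map(|sym| byte_dfa.run(s as u16, sym))
                    .collect::<Vec<u16>>()
            })
            .collect();
        SymbolDfa { byte_dfa, sym_next }
    }

    /// Match a compressed string without decompressing it.
    /// Assumes every non-escape code is a valid index into the symbol table.
    pub fn matches(&self, codes: &[u8]) -> bool {
        let mut state = 0u16;
        let mut i = 0;
        while i < codes.len() {
            if codes[i] == ESCAPE {
                if i + 1 >= codes.len() {
                    break; // truncated escape: treat as end of input
                }
                state = self.byte_dfa.next[state as usize][codes[i + 1] as usize];
                i += 2;
            } else {
                state = self.sym_next[state as usize][codes[i] as usize];
                i += 1;
            }
            if state == self.byte_dfa.accept {
                return true; // absorbing, so the rest of the input cannot change the result
            }
        }
        state == self.byte_dfa.accept
    }
}
```

Because the accepting state is absorbing, a match that completes in the middle of a symbol is still visible at the symbol boundary, which is what lets the per-symbol table stand in for byte-by-byte matching. The table costs one entry per (state, symbol) pair and is built once per pattern and symbol table.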